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ABSTRACT 

The JWST Early Release Observations (ERO) included a NIRISS/SOSS (0.6-2.8 um) transit of the ~ 850K Saturn-mass 
exoplanet HAT-P-18 b. Initial analysis of these data reported detections of water, escaping helium, and haze. However, active 
K dwarfs like HAT-P-18 possess surface heterogeneities — starspots and faculae — that can complicate the interpretation of 
transmission spectra, and indeed, a spot-crossing event is present in HAT-P-18 b’s NIRISS/SOSS light curves. Here, we present 
an extensive reanalysis and interpretation of the JWST ERO transmission spectrum of HAT-P-18b, as well as HST/WFC3 
and Spitzer/IRAC transit observations. We detect H20 (12.5 7), CO2 (7.3.0), a cloud deck (7.4 o), and unocculted starspots 
(5.8 ©), alongside hints of Na (2.7 o). We do not detect the previously reported CH4 (log CH4 < -6 to 2 o). We obtain excellent 
agreement between three independent retrieval codes, which find a sub-solar H2O abundance (log H20 ~ —4.4 + 0.3). However, 
the inferred CO7 abundance (log CO2 ~ —4.8 + 0.4) is significantly super-solar and requires further investigation into its origin. 
We also introduce new stellar heterogeneity considerations by fitting for the active regions’ surface gravities — a proxy for the 
effects of magnetic pressure. Finally, we compare our JWST inferences to those from HST/WFC3 and Spitzer/IRAC. Our results 
highlight the exceptional promise of simultaneous planetary atmosphere and stellar heterogeneity constraints in the era of JWST 
and demonstrate that JWST transmission spectra may warrant more complex treatments of the transit light source effect. 


Key words: planets and satellites: atmospheres — planets and satellites: gaseous planets — planets and satellites: individual: 
HAT-P-18 b — stars: starspots — methods: data analysis — techniques: spectroscopic 


1 INTRODUCTION 


In the works for over two decades, the James Webb Space Tele- 
scope (JWST) is finally operational. Astronomers can now count on 
* E-mail: marylou.fournier.tondreau @umontreal.ca space instruments with modes designed to study exoplanetary atmo- 
+ NHFP Sagan Fellow spheres. Transmission spectroscopy is a commonly used method to 
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$ Trottier Postdoctoral Fellow reveal the composition and structure of an atmosphere and, there- 
{ Banting Postdoctoral Fellow fore, to enable inferences about a planet’s formation and evolution 
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whereas some light passes through the planetary atmosphere and can 
introduce measurable absorption features (Seager & Sasselov 2000; 
Brown 2001). The resulting transmission spectrum can reveal key in- 
formation regarding the abundance of molecular and atomic species 
and the presence of clouds and hazes (e.g., Charbonneau et al. 2002; 
Tinetti et al. 2007; Wakeford & Sing 2015; Sing et al. 2016). 

Astronomers have faced many challenges when conducting atmo- 
spheric studies with the Hubble (HST) and Spitzer Space Telescopes 
since neither observatory was designed for exoplanet observations. 
Numerous technical difficulties, instrument systematics, as well as 
narrow wavelength range and spectral resolution, have limited atmo- 
spheric inferences. For example, observations with the Wide Field 
Camera 3 (WFC3) instrument aboard HST are complicated by sys- 
tematic trends, such as “HST breathing” effects, visit-long slopes, 
and the “ramp” effect (Wakeford et al. 2016). Nevertheless, the ef- 
forts and ingenuity of scientists have led to astonishing discoveries, 
such as the detection of several chemical species, including water 
vapour on hot Jupiters (e.g., Tinetti et al. 2007; Tsiaras et al. 2018) 
and even on a sub-Neptune (e.g., Benneke et al. 2019b), the infer- 
ence of clouds and hazes in several gas giants (e.g., Wakeford & Sing 
2015; Sing et al. 2016), and the observation of atmospheric escape 
on hot Neptune-like exoplanets (e.g., Ehrenreich et al. 2015; Spake 
et al. 2018). 

HAT-P-18b was discovered in 2010 by Hartman et al. (2011) 
using the Hungarian-made Automated Telescope Network. It is of 
approximately Saturn mass (M = 0.197 Mj), but with an inflated 
radius (R = 0.995 Ry), due to its higher temperature (Teq = 852 K) 
relative to the Solar System giants, which is a consequence of the 
planet’s comparatively short orbital period (P = 5.5 days). Ground- 
and space-based transmission spectroscopy has been performed on 
this target. Kirk et al. (2017) suggested a high-altitude haze con- 
sistent with the detection of Rayleigh scattering and the absence 
of the sodium absorption feature using the Auxiliary-port CAM- 
era (ACAM) instrument on the William Herschel Telescope (WHT). 
Tsiaras et al. (2018) detected the presence of water vapour and a grey, 
opaque cloud deck in HAT-P-18 b’s atmosphere using HST/WFC3, 
reporting a water abundance of log H20 = —2.63 + 1.18 and a cloud- 
top pressure of log Pejoug [Pa] = 2.82 + 0.91. Fu et al. (2022) pre- 
sented an analysis of the transit observed in the Single Object Slitless 
Spectroscopy (SOSS) mode (Albert et al. 2023) of the Near Infrared 
Imager and Slitless Spectrometer (NIRISS) instrument (Doyon et al. 
2023) on board the JWST and detected water (with an abundance 
of log H20 = -3.03+031), hints of methane (Alog Z = 3.79, or a 
3.2 0 confidence), as well as excess helium absorption and tail in an 
otherwise very hazy atmosphere. 

HAT-P-18 is an active K dwarf with an effective temperature of 
4803 K and a slightly super-solar metallicity ([Fe/H] = 0.10 + 0.08; 
Hartman et al. 2011). Stellar active regions, such as starspots and 
faculae, can introduce spectral features in transmission spectra that 
overlap those of exoplanetary atmospheres. Occulted active regions 
were often masked when fitting transit models to spectroscopic light 
curves; however, the impact of the occulted spot on the transmission 
spectrum is still present despite that (e.g., Oshagh et al. 2014; Bixel 
et al. 2019). This can also lead to a biased transmission spectrum 
by impacting not only the transit depth but possibly the mid-transit 
time, the scaled semi-major axis, and the impact parameter (e.g., 
Barros et al. 2013; Alexoudi et al. 2020). Recent studies have moved 
towards joint inferences of transit and active region properties (e.g., 
Bixel et al. 2019; Espinoza et al. 2019a). The NASA Study Analysis 
Group on the effect of stellar contamination on space-based transmis- 
sion spectroscopy (SAG 21) of the Exoplanet Exploration Program 
Analysis Group (ExoPAG) recommends performing these joint in- 
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ferences with future observations instead of masking active region 
occultations (Rackham et al. 2023). 

Unocculted stellar active regions have long been recognized as 
a significant obstacle to exoplanet transmission spectroscopy. Early 
HST studies recognized that unocculted cool starspots can cause 
strong transit depth slopes towards short visible wavelengths (e.g., 
Pont et al. 2007, 2013; McCullough et al. 2014). Similarly, unoc- 
culted hot faculae can imprint a negative slope in transmission spectra 
(e.g., Rackham et al. 2018; Kirk et al. 2021). The physical origin of 
this ‘stellar contamination’ is a mismatch between the intensity of the 
stellar surface sampled by the planet during transit and the average 
spectrum of the star (including the photosphere, starspots, and facu- 
lae). Since the contrast ratio between the spectra of stellar regions at 
different temperatures increases as shorter wavelengths, this ‘transit 
light source effect’ (TLSE; Rackham et al. 2018; Barclay et al. 2021) 
is wavelength-dependent and more significant at visible wavelengths. 
The most common approach to deal with the TLSE in early studies 
was to correct the transmission spectrum based on activity moni- 
toring or occulted starspot properties (e.g., Pont et al. 2008; Berta 
et al. 2011; Sing et al. 2011). Barstow et al. (2015) demonstrated that 
not accounting for starspots when modelling transmission spectra of 
giant planets can bias retrieved molecular abundances, while (Rack- 
ham et al. 2018) further showed that the TLSE can dominate over 
absorption features for terrestrial planets. Moran & Stevenson et al. 
2023 provide a recent example of this prediction, finding degenerate 
interpretations between unocculted starspots and atmospheric H2O 
for JWST observation of a terrestrial exoplanet. 

Recent years have seen a renewed focus on incorporating unoc- 
culted stellar regions into atmospheric retrieval codes, allowing si- 
multaneous inferences of stellar and atmospheric properties. Pinhas 
et al. (2018) developed a transmission spectrum retrieval framework 
to jointly fit a single population of unocculted stellar heterogeneities 
and a planetary atmosphere. Subsequent retrieval studies have ex- 
plored the fidelity of TLSE retrieval assumptions (e.g., Iyer & Line 
2020; Thompson et al. 2023; Rackham & de Wit 2023) and applied 
these joint retrievals to interpret observations from HST and Spitzer 
(e.g., Bruno et al. 2020; Rathcke et al. 2021), ground-based tele- 
scopes (e.g., Bixel et al. 2019; Jiang et al. 2021; Kirk et al. 2021), 
and JWST (Moran & Stevenson et al. 2023). However, the SAG 21 
report (Rackham et al. 2023) notes that there is considerable scope 
to improve the realism and complexity of retrieval prescriptions for 
unocculted stellar active regions. 

In this work, we aim to disentangle stellar and planetary atmo- 
sphere signals by including stellar heterogeneities in transit fits and 
atmospheric retrievals. We present and compare two independent at- 
mospheric reanalyses of HAT-P-18 b, one using JWST NIRISS/SOSS 
Early Release Observations (ERO) transit observation and another 
combining transit observations from HST/WFC3 and the Infrared Ar- 
ray Camera (IRAC) of Spitzer. We describe the observations and data 
reduction approach in Section 2 and detail our JWST NIRISS/SOSS 
light curve fitting and occulted starspot analysis in Section 3. Sec- 
tion 4 describes our joint stellar heterogeneity and atmospheric re- 
trieval method and presents results from three independent retrieval 
codes. We summarize and discuss our results in Section 5 and con- 
clude in Section 6. 


2 OBSERVATIONS & DATA REDUCTION 


The scientific legacy of HST and Spitzer is considerable, particularly 
for exoplanet studies; JWST will, in many ways, build on this legacy. 
In this work, we present transit observations with JWST and its pre- 


Table 1. Parameters of the HAT-P-18 planetary system used in this analysis 


Parameters HAT-P-18 Units 
Stellar parameters 

Spectral type K2V 

Stellar radius 0.749 + 0.037 Ro 
Stellar mass 0.770 + 0.031 Mo 
Metallicity 0.10 + 0.08 [Fe/H] 
Stellar surface gravity 4.57 + 0.04 log jg cm/s? 
Effective temperature 4803 + 80 K 


Planetary and transit parameters 


Planet radius 0.995 + 0.052 Ry 
Planet mass 0.197 + 0.013 My 
Orbital period 5.508023 + 0.000006 day 
Eccentricity 0.084 + 0.048 

Argument of periastron 120.0 + 56.0 deg 
Impact parameter 0.32470. 05) 

Scaled semi-major axis 16.04 + 0.75 

Transit duration 0.1131 + 0.0009 day 
Scaled planet radius 0.1365 + 0.0015 

Equilibrium temperature 852 + 28 K 


Note: Parameters from Hartman et al. 2011 


decessors, HST and Spitzer, to show the potential of NIRISS/SOSS to 
characterize exoplanetary atmospheres and to cope with challenges 
such as stellar contamination. 


2.1 Observations 
2.1.1 JWST NIRISS/SOSS 


HAT-P-18b was observed in transit using the SOSS mode of the 
NIRISS instrument (Albert et al. 2023; Doyon et al. 2023) on board 
JWST as part of the ERO program (PI: Klaus M. Pontoppidan; Pon- 
toppidan et al. 2022). The time series observation (TSO) started on 
June 13th, 2022, at 04:36:50.861 UTC and lasted 7.15 hours, which 
covered the 2.7 hours transit as well as some baseline before and after 
the transit. The GR700XD grism and the CLEAR filter were used, 
along with the SUBSTRIP256 subarray, which captures both diffrac- 
tion orders 1 and 2. There are 469 integrations, each consisting of 
9 groups and lasting 54.94 seconds. In addition, a second exposure 
with the GR700XD grism and F277W filter was taken directly after 
the end of the main CLEAR filter TSO. This exposure had the exact 
same readout parameters as the CLEAR exposure, except it consisted 
of only 11 integrations lasting 10.07 minutes. The TSO was previ- 
ously published by Fu et al. (2022), the major findings of which are 
summarized in Section 1. 


2.1.2 HST/WFC3 + Spitzer/IRAC 


Transit observations of HAT-P-18 b were obtained using HST/WFC3 
with the G141 grism in the spatial scan mode. These come from two 
visits, consisting of five orbits each, made as part of the Cycle 23 HST 
General Observer campaign (PI: Drake Deming; Deming et al. 2015) 
in February 2016 and January 2017. These observations spanned the 
wavelength range from 1.1 to 1.7 um and were downloaded from 
the Mikulski Archive for Space Telescopes (MAST). The raw light 
curves show characteristic, non-linear systematics because of thermal 
effects induced by the telescope’s 96-minute orbit (Foster et al. 1995). 
This leads us to follow standard practice (e.g., Benneke et al. 2019b) 
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and discard from our analysis the first orbit as well as the first two 
measurements in each subsequent orbit. This leaves one orbit on 
either side of the transit and two orbits corresponding to the transit 
itself, including significant portions of the ingress and egress. These 
HST transits were previously published by Tsiaras et al. (2018). 

Transit observations with Spitzer come from two visits in 2013 
using the IRAC instrument in the 3.6 and 4.5 um channels (PI: Jean- 
Michel Désert; Desert et al. 2012). The data were obtained from the 
Spitzer Heritage Archive. 


2.2 Data Reduction 
2.2.1 JWST NIRISS/SOSS Data Reduction 


We reduced the NIRISS/SOSS TSO using the supreme-SPOON 
pipeline (Feinstein et al. 2023; Coulombe et al. 2023; Radica et al. 
2023; Lim et al. 2023), starting from the raw, uncalibrated files down- 
loaded from the MAST archives. Some steps in the supreme-SPOON 
pipeline are handled by the official JWST pipeline. We follow a 
nearly identical procedure for the reduction as was presented in Fe- 
instein et al. (2023) and Radica et al. (2023): we first process the 
TSO through supreme-SPOON Stage 1, which performs the detector 
level calibrations. This includes, in particular, the subtraction of the 
column-correlated 1/f noise, during which we ensure to properly 
mask several undispersed (zeroth order) as well as the sole dispersed 
(likely a second order, slightly above the target third order) contami- 
nants of background field stars, and the ramp fitting to convert from 
3D “ramp” to 2D “slope” images. 

Stage 2 of supreme-SPOON performs further reductions on the 
slope-level products, including the background subtraction, for which 
we use the SOSS SUBSTRIP256 background model provided by the 
Space Telescope Science Institute (STScI)!, and tracing of the target 
orders to define the extraction boxes as well as the stability of the 
target trace (changes in x and/or y position, changes in morphology, 
etc.) over the course of the TSO. For further details on these steps, 
please see Radica et al. (2023). We note, though, that using a constant 
scaling of the STScI background model, such as in Radica et al. 
(2023), did not perfectly remove the background step. We, therefore, 
separately scaled the STScI model blueward and redward of the step 
(e.g., Lim et al. 2023). We find values for the best-fitting background 
scaling to be 0.4384 and 0.4080 for pre- and post-jump, respectively. 

We extract the stellar spectra using the ATOCA algorithm (Darveau- 
Bernier et al. 2022) using a specprofile reference file created 
specifically for this observation using the APPLESOSS code (Radica 
et al. 2022). ATOCA explicitly takes into account the overlap between 
the first and second orders of the target spectrum on the detector dur- 
ing the extraction, though the expected dilution introduced from this 
overlap is predicted to be near-negligible for relative measurements 
such as exoplanet transmission spectroscopy (Darveau-Bernier et al. 
2022; Radica et al. 2022). 

We use the updated APPLESOSS v2.0.0 in this work. The ini- 
tial version of APPLESOSS presented in Radica et al. (2022) used 
the WebbPSF package (Perrin et al. 2014) to simulate the extended 
wings of the SOSS PSF (Point Spread Function). However, due to 
concerns that the simulated PSFs underpredict the SOSS wings’, we 
update the APPLESOSS framework to use the wings of order 1 profiles 


! https://jwst-docs.stsci.edu/jwst-calibration-pipeline-c 
aveats/jwst-time-series-observations-pipeline-caveats/nir 
iss-time-series-observations-pipeline-caveats#NIRISSTimeS 
eriesObservationsPipelineCaveats-SOSSskybackground 

2 JWST Technical Report JWST-STScI-008270 


MNRAS 000, 1-20 (2023) 


4 Marylou Fournier-Tondreau et al. 


25000 
22500 
20000 Q 
O 
17500 © 
= 
15000 Wi 
pe 
12500 U 
Z 
10000 — 


7500 


| o 
(NG) sjuno5 


Spatial Pixel 


I 
A 
o 


ul 


w 


(S/NG) sjunoD 


1000 1250 
Spectral Pixel 


Figure 1. Data products at different stages of the reduction process. (A): A raw, uncalibrated data frame in data numbers (DN). (B): Data frame after superbias 
subtraction and reference pixel correction. (C): 1/f noise. (D): After ramp fitting and flat field correction. (E): Final calibrated data product after background 
subtraction and bad pixel correction. 
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bluewards of ~ 1.1 ym where there is no contribution from order 2 
on the detector. We note, however, that the transmission spectrum 
is unchanged whether simulated or empirical wings are used, as the 
strength of the self-dilution is significantly smaller than the transit 
depth precision. Finally, after the extraction, we clip any data points 
which deviate by more than 5 o from their neighbours in the time 
direction; < 2% of points are rejected in this way. We find no signifi- 
cant deviation in the target trace position with the positions included 
in the spectrace reference file>, and we, therefore, use the default 
SOSS wavelength solution. A summary of the major reduction steps 
is shown in Figure 1. 

HAT-P-18 has a co-moving white dwarf companion separated by 
only ~ 2.66” at a position angle of ~ 186°, as revealed by GAIA 
astrometry (Mugrauer & Michel 2021)*. Given the aperture position 
angle of the SOSS observation (252.09°), the spectral trace of this 
companion is offset by (—9, +40) pixels relative to the trace of HAT- 
P-18, i.e., mostly out of the extraction aperture throughout order 1 
and for all wavelengths < 0.85 um at order 2. Combined with its 
measured contrast of 8 mag at 1.4 um (Mugrauer & Michel 2021), 
it has a negligible effect on the flux extracted for HAT-P-18. No 
particular action was thus taken to deal with it. 


2.2.2 HST/WFC3 + Spitzer/IRAC Data Reduction & Light Curve 
Fitting 


Following standard procedure for HST/WFC3 observations (e.g., 
Deming et al. 2013; Benneke et al. 2019b), we perform the nec- 
essary data reduction and light curve fitting by using the modular 
Exoplanet Transit Eclipses and Phase curves (ExoTEP) framework 
(Benneke et al. 2017, 2019a), which employs the batman transit 
light curve model (Kreidberg 2015). Using a Markov chain Monte 
Carlo (MCMC) method with the emcee package (Foreman-Mackey 
et al. 2013), ExoTEP jointly fits the transit and systematics mod- 
els for the two visits along with photometric noise parameters. We 
construct spectrophotometric light curves by spectrally binning at 
30 nm intervals (Kreidberg et al. 2014) the extracted flux for each 
visit. This binning was chosen because it offered the best trade-off 
between the number of data versus error bar size, as compared to 
10 and 60 nm bin widths. We follow Benneke et al. (2019b) for the 
procedure of the white and spectrophotometric light curve fits. The 
latter for both visits are shown in Figure B1, whereas the best-fitting 
values from the white light curve fit are quoted in Table Al. The 
resulting HST/WFC3 transmission spectrum is displayed in Figure 
3. 

We follow standard procedure for Spitzer/IRAC image processing 
(e.g., Benneke et al. 2019b). As was done for the HST data, we em- 
ploy ExoTEP to reduce the data further and fit the Spitzer light curves. 
After correcting for the Spitzer-specific systematics, we obtain a fi- 
nal light curve for each of the two channels as displayed in Figure 
B2. We used 80-second bins for the time axis due to the relatively 
high cadence of the Spitzer data. Compared to the HST spectropho- 
tometric light curves, we find that the scatter is higher, resulting in 
larger errors for the Spitzer data points in the combined transmission 
spectrum. We kept the transit depths from Spitzer/IRAC in the HST 
transmission spectrum in order to better constrain the atmospheric 
retrievals. 


3 spectrace reference file jwst_niriss_spectrace_0023.fits was 


used in this work. 
4 This companion was also previously reported as a candidate by Ginskiet al. 
(2016) based on lucky imaging. 
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3 JWST NIRISS/SOSS LIGHT CURVE FITTING & 
OCCULTED STARSPOT ANALYSIS 


3.1 Light Curve Fitting with Spot-Crossing Masked 


For the SOSS data, we construct separate white light curves for orders 
1 and 2 by summing the flux from all wavelengths in order 1 (0.85— 
2.8 um) and from 0.6—0.85 um in order 2. We then fit a transit model 
to each white light curve using the juliet package (Espinoza et al. 
2019b). We fix the orbital period to 5.508023 d, the eccentricity to 
0.084, and the argument of periastron to 120° (Hartman et al. 2011; 
all their values used in this paper are listed in Table 1), and put wide, 
uninformative priors on the time from mid-transit, Tọ, the impact 
parameter, b, the scaled orbital semi-major axis, a/ R+, and the scaled 
planet radius, Rp/R+. We fit two parameters of the quadratic limb 
darkening law following the parameterization of Kipping (2013), a 
scalar jitter term, o, which replaces the flux errors reported by the 
reduction pipeline, and finally, a term to fix the zero point of the 
transit baseline. Unlike previous SOSS TSOs (Feinstein et al. 2023; 
Radica et al. 2023), we find that no detrending is necessary, as the 
white light curves for both orders are best fit by a transit model 
with no additional systematics models: Alog Z = 26.1, or a >50 
preference by the Benneke & Seager (2013) scale. 

There is a spot-crossing event clearly visible just after the time 
from mid-transit. Here we do not include a model of this occulted 
starspot and instead mask the integrations associated with the spot- 
crossing (integrations 240 — 270). Therefore, we fit eight parameters 
for each order. The reduced Chi-squared statistic for the fits is x = 
1.08 for order 1 and 1.06 for order 2. The best-fitting transit models 
for orders 1 and 2 are overplotted in Figure 2. 

We then proceed to fit the spectrophotometric light curves at the 
pixel level — that is, we fit one light curve per pixel column on the 
detector. This results in 2038 light curves for order 1 and 567 for order 
2. For these fits, we fix the orbital parameters and the transit baseline’s 
zero point to their best-fitting values from the order 1 white light curve 
fit because of the better signal-to-noise (S/N). We leave only the 
scaled planet radius, limb-darkening parameters, and jitter term free. 
We put Gaussian priors on the limb-darkening parameters based on 
calculations from the ExoTiC-LD package (Wakeford & Grant 2022) 
using the 3D stagger grid (Magic et al. 2015). The widths of the 
Gaussian priors are set to 0.1 (e.g., Pontoppidan et al. 2022; Espinoza 
et al. 2022). We tested larger widths of 0.2 following Patel & Espinoza 
(2022) and the retrieved transmission spectrum is consistent within 
less than 10>. We also experimented with freely fitting the limb 
darkening coefficients and found that this results in a slight (~20 ppm) 
offset to the spectrum while preserving the relative amplitudes of the 
spectral features. We again mask the integrations associated with 
the spot-crossing in each fit. The resulting transmission spectrum is 
displayed in Figure 3. 

Although we do not directly use the F277W exposure for science 
purposes in this study, as in Radica et al. (2023), it is useful to 
pinpoint undispersed (order 0) contaminants of field stars. Several 
bright contaminants are clearly visible in the F277W data frame, 
many of which are also readily visible in the CLEAR frame. However, 
unlike in Radica et al. (2023), we are unable to post-process our 
transmission spectrum to correct for the diluting effects of field star 
contamination. The contaminants are too bright preventing good 
reconstruction of the target trace. Moreover, as pointed out by Fu 


5 We also tested a retrieval on that transmission spectrum, and all retrieved 
parameters are consistent within 1 o to those retrieved with the main trans- 
mission spectrum (see Section 4). 
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et al. (2022), one contaminant is time-varying. We, therefore, mask 
the affected regions of the transmission spectrum as was done in 
Feinstein et al. (2023) and Fu et al. (2022). The wavelength regions 
masked in this way are as follows: 0.714 — 0.724 um and 0.841 — 
0.85 pm in order 2, and 0.853 — 0.870 pm, 1.048 — 1.061 pm, 1.366 
— 1.384 pm, and 1.972 — 2.011 ym in order 1. A frame of the F277W 
exposure, highlighting the masked regions is shown in Figure B3. 

We also produce a transmission spectrum using an independent 
reduction pipeline, NAMELESS (Feinstein et al. 2023; Coulombe et al. 
2023; Radica et al. 2023), as shown in Figure A1, to verify the self- 
consistency of our results (see Appendix A). The best-fitting transit 
parameters from NIRISS/SOSS white light curve fits are shown in 
Table Al. 


3.2 Occulted Starspot Method 


We also perform a second light curve fit, enabling a joint inference 
of the starspot and planet properties to model the spot and study its 
impact on the transmission spectrum. To this end, we first compute a 
single broadband light curve by summing the flux from wavelengths 
0.65-0.85 um of order 2 together with wavelengths 0.85—1.5 um of 
order 1. We exclude wavelengths bluewards of 0.65 um to maximize 
the S/N and wavelengths redwards of 1.5 um because the effect of 
spot crossings is stronger at shorter wavelengths where the spot con- 
trast with respect to the stellar photosphere is larger. We fit a transit 
model with a spot-crossing event using spotrod (Béky et al. 2014), 
which we have implemented into the juliet package (Espinoza 
et al. 2019b). We fix the period, the eccentricity and the argument of 
periastron to the Hartman et al. (2011) values and fit the mid-transit 
time, the impact parameter, the scaled semi-major axis, the scaled 
planet radius, the two parameters of the quadratic limb darkening 
law, and the jitter term with wide, uninformative priors. We also fit 
the zero point of the transit baseline, though the best-fitting value is 
again very close to zero. We fit four additional parameters to model 
the starspot: the x- and y-positions of the spot, its radius, and its spot- 
to-stellar flux contrast. The positions and the radius of the spot are in 
stellar radii units. The centre of the star is at (0, 0). We use uniform 
priors ranging from 0 to 1. We employ dynesty (Speagle 2020) 
to sample the parameter space with 5000 live points. The reduced 
Chi-squared statistic for the fit with the highest likelihood is yy = 
1.08. There is strong evidence (A log Z = 95.7, or a > 5 o preference) 
for a transit model with an occulted spot. The best-fitting parameters 
for the broadband light curve fit with the spot-crossing modelled are 
displayed in Table Al. The choice of wavelength range (0.65-1.5 
pm) results in best-fitting values that are more precise and overall 
consistent at 1 ø to the best-fitting values with the entire wavelength 
ranges of order 1 and 2. We note that the y-position of the spot is 
not well constrained, as shown in the corner plot in Figure B4, so 
we use the set of parameters with the highest likelihood instead of 
the medians of the posteriors from the broadband light curve fit, to 
ensure that the parameters are self-consistent. This best-fitting transit 
model is overplotted in the top panel of Figure 4, and a cartoon of 
the spot-crossing event is shown in the bottom panel. 

We then fit the spectrophotometric light curves at a resolving power 
of R = 100, fixing the orbital parameters (Tọ, b, a/R), as well as the 
spot positions and radius from the above broadband light curve fit. 
We fit the scaled planet radius, the limb-darkening parameters, the 
spot contrast and the jitter coefficient with the priors described above. 
The spectrophotometric light curves for 14 bins with their best-fitting 
transit models are shown in Figure 5 for the highest likelihood model. 
The spot contrast is correlated with the y-position (see Figure B4) 
because there is a degeneracy between the radius and the contrast of 
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the spot given the strong correlation between the spot temperature and 
filling factor® (Bruno et al. 2022). To capture the uncertainties coming 
from the degeneracies in the spot y-position, size and contrast, we 
repeat the fit of the spectrophotometric light curves 50 times by fixing 
the same parameters (To, b, a/R», spot positions and radius) to 50 
random sets taken from the posterior samples of the broadband light 
curve fit. We draw the 50 sets from the 68 % of the weighted samples 
with the highest likelihoods. 

We next constrain the temperature of the occulted spot by fitting 
PHOENIX stellar atmosphere model spectra (Husser et al. 2013) 
to the spot contrast spectrum resulting from the spectrophotometric 
light curves fits. Starspots’ spectra are thought to be represented by 
stellar models with lower temperatures and 0.5—1 dex lower log g 
than the host star. The increased magnetic pressure in active features 
decreases the gas pressure (Solanki 2003; Bruno et al. 2022), which 
is akin to a stellar model with a lower surface gravity. In fitting for 
the spot temperature, we thus also left the spot surface gravity to 
vary as a free parameter. For the spot, we use stellar model spectra 
with temperatures from 4000 to 5000 K, logarithmic surface gravities 
from 1 to 5 dex and a fixed metallicity of 0.1 from Hartman et al. 
(2011). We interpolated these spectra in temperature and log g as 
needed during the fitting process. For the star, we use a stellar model 
spectrum with the temperature, log g and metallicity fixed to the 
corresponding values from Hartman et al. (2011). The model spot 
contrast spectrum is then simply the ratio of the model spot spectrum 
to the model star spectrum. We fit the temperature of the spot and 
the log g with wide, uninformative priors using the encee MCMC 
package (Foreman-Mackey et al. 2013). 


3.3 Inferred Occulted Starspot Properties on HAT-P-18 


From the aforementioned process, we obtained 51 starspot contrast 
spectra from the 51 spectrophotometric fits performed: one from 
the highest likelihood values and 50 with random draws from the 
broadband light curve fit. Each model has its own set of parameters 
(i.e., To, b, a/R», spot positions and radius). The resulting density of 
spot contrast spectra is shown in Fig. 6. By inspecting the individual 
contrast spectra, we see that 28 of the 50 random fitting models 
(56%) have a retrieved contrast spectrum within 1o of the one 
retrieved with the highest likelihood set of parameters. We lump 
these solutions together as the “most likely” one (blue model in 
Figure 6). The mean and standard deviation of the spot x- and y- 
position, radius, temperature, and the model log g are given in Table 
2 for this group of solutions. This corresponds to a spot radius of 
0.116 + 0.014 stellar radii, a temperature AT = -93 + 15 K colder 
than the star (or Tspot = 4710 + 15 K) and a lower log g by 1.16 + 
0.19 dex. The retrieved log g for the starspot model is consistent with 
Bruno et al. (2022) expectation. 

Among the remaining contrast spectra, those not within 1 o of the 
one with the highest likelihood, we can identify three other broad 
families of solutions corresponding to a colder spot because of a 
smaller filling factor; see Table 2. The transmission spectra for these 
four families of solutions do not significantly change (they are con- 
sistent at the 1 ø level) from the transmission spectrum obtained with 
the spot occultation features masked (as done in Section 3.1). This is 
expected for a spot of that size and temperature, as the TLSE from 
occulting such a spot results only in a small slope toward bluer wave- 
lengths in the transmission spectrum, as shown in Figure B5. This 


6 The filling factor corresponds to the fraction of the planetary disc occulting 
the active region. 
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Figure 2. Top: NIRISS/SOSS white light curves for order 1 (left) and order 2 (right) with the best-fitting transit model overplotted in black. The integrations 
associated with the spot-crossing are shown in beige. The fit statistics are indicated for each order: Xz; the reduced Chi-squared, o`; the average error bar, and 
e; the error multiple to obtain a y2 equal to unity. Middle: Residuals to the transit fits. The RMS scatter in the residuals is indicated for each order. Bottom: 
Histogram of residuals. 
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Figure 3. Transmission spectra of HAT-P-18 b; one with JWST NIRISS/SOSS at pixel resolution (faded grey) and binned to a resolving power of R = 100 (blue) 
and another with HST/WFC3 (red). Note that no offset was applied to the HST spectrum. 


effect, estimated to be of the order of 10 to 20 ppm from Equation 4 
in Section 4, is smaller than our transit depth uncertainties. 


4 RETRIEVAL ANALYSIS 


We now present the inferences from our retrieval analysis of HAT-P- 
18 b’s transmission spectrum with NIRISS/SOSS (from Section 3.1). 
Our retrievals jointly model the influence of the planetary atmosphere 
and unocculted stellar heterogeneities on the transmission spectrum 
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Table 2. Families of solutions for the occulted starspot 


; Fraction x-position y-position size AT Alog g 
Solutions of models [Re] [Ra] [Ra] [K] [logio cm/s?] 
Most likely 0.56 0.090 + 0.005 0.42 +0.05 0.116 + 0.014 93 +15 1.16 + 0.19 
Colder spot 
Similar position 0.18 0.092 + 0.003 0.41 +0.07 0.090 + 0.004 140 +20 1.8 + 0.2 
Higher position 0.14 0.088 + 0.004 0.53 +0.03 0.125+0.018 180+ 50 2.1 + 0.4 
Lower position 0.12 0.091 + 0.005 0.24 + 0.05 0.14 + 0.03 190 + 80 2.2 + 0.6 
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Figure 4. Modelling of the spot-crossing event. Top: NIRISS/SOSS broad- 
band light curve (blue), along with the best-fitting model overplotted (black) 
and the fit statistics listed in the bottom left corner (x2; the reduced Chi- 
squared, o`; the average error bar, and e; the error multiple needed to obtain a 
x equal to unity). Middle: Residuals to the transit fit with the RMS scatter. 
Bottom: Physical representation of the maximum likelihood solution for the 
occulted starspot (black circle) on the star (yellow circle), along with the 
transit motion in black (dashed lines representing the transit chord) of the 
planet (orange circle). 
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Figure 5. JWST NIRISS/SOSS binned spectrophotometric light curves, along 
with the best-fitting transit models using spotrod (black). Left: Normalized 
spectrophotometric light curves at a resolving power of R = 100. Right: 
Associated residuals from the light curve fit in each bin with the RMS scatter 
indicated. 
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Figure 6. Density of contrast spectra from the 50 models fit, along with the 
most likely model in blue and a colder solution model in green (the one with 
a similar spot position). The model in red represents the best-fitting model to 
the highest likelihood contrast spectrum if we only fit the spot temperature, 
fixing the surface gravity to the stellar value. The Chi-squared statistic for this 
fit is y? = 193 instead of y? = 156 when the spot model’s surface gravity is 
also fit. 


and thus yield simultaneous constraints. We conducted this analysis 
with three independent retrieval codes to ensure robust results. 

In what follows, we first describe our approach for joint retrievals of 
a planetary atmosphere and stellar contamination. We then describe 
the setup of our three retrieval codes before presenting our resulting 
constraints on HAT-P- 18 b’s atmosphere and unocculted stellar active 
regions. Finally, we compare retrieval results from our HST + Spitzer 
transmission spectrum to those yielded from NIRISS/SOSS. 


4.1 A Joint Planetary Atmosphere and Stellar Contamination 
Retrieval Method 


The influence of active stellar regions on exoplanet transmission 
spectra — the transit light source effect (e.g., Rackham et al. 2018) 
— can be modelled simultaneously with the planetary atmosphere in 
retrievals. This has the advantage of yielding joint constraints on the 
planetary atmosphere and unocculted stellar heterogeneities while 
also propagating uncertainties from the star into the derived atmo- 
spheric constraints (see Rackham et al. 2023, for a review). Several 
retrieval studies have included prescriptions for unocculted starspots 
and/or faculae (e.g., Pinhas et al. 2018; Bixel et al. 2019; Iyer & 
Line 2020; Rathcke et al. 2021; Jiang et al. 2021; Rackham & de Wit 
2023; Thompson et al. 2023), with the most common being a three- 
parameter model that fits for the temperature and covering fraction 
of a single heterogeneity (Pinhas et al. 2018). While these treatments 
have generally sufficed for Hubble and ground-based data, the excep- 
tional data quality and wavelength coverage of JWST motivates the 
consideration of more complex stellar contamination models. 

We investigate arange of unocculted stellar heterogeneity prescrip- 
tions during our HAT-P-18b retrievals. Our goal is to determine 
the appropriate level of starspot and/or faculae model complexity 
required to interpret the JWST/NIRISS transmission spectrum of 
HAT-P-18b while also informing future JWST retrieval studies of 
hot Jupiters. We begin by summarizing the equations underlying the 
transit light source effect. 


4.1.1 The Transit Light Source Effect 


The transmission spectrum of an exoplanet transiting a star with a 
heterogeneous stellar surface can be written as (MacDonald & Lewis 
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2022) 


Aq = 6a, atm EA, het Ya, night (D) 


where Aj is the observed transmission spectrum, 62, atm is the 
transmission spectrum from the planetary atmosphere, €) het is the 
wavelength-dependent “stellar contamination factor” from a hetero- 
geneous stellar surface, and Y4, night accounts for thermal emission 
from the planetary nightside (we assume this final term negligibly de- 
viates from unity for HAT-P-18 b). For a planetary atmosphere with 
properties varying only with altitude (the 1D assumption), 6, atm is 
given by 


2 SPP recta) 
R? toy — 2 n be Tasim (b) qp 


(2) 


Oa, atm R2 
where Rp, top is the planetary radius at the top of the modelled at- 
mosphere, b is the ray impact parameter, R, is the stellar radius, 
and T3, slant İs the slant optical depth. For a stellar surface with Nhet 
unocculted heterogeneous active regions (e.g., spots or faculae), the 
stellar contamination factor, €), het, can be expressed as 


1 
EA, het = —y (3) 
al IA, het, i 
L- XA fn (1 = EE! i ) 
i=l A, phot 


where fhet,i is the fractional stellar disc coverage of the i™ het- 
erogeneous region, with corresponding specific intensity 1), het, i» 
while 1), phot is the specific intensity of the stellar photosphere. 
Equation 3 demonstrates that the stellar contamination factor de- 
viates from unity when a heterogeneity possesses a different in- 
tensity from the photosphere. In general, /) het, ; is shorthand for 
I, net, i ([Fe/H];, log gi, Ti), where [Fe/H]; is the local metallicity 
of the heterogeneity, log g; is the local log surface gravity, and T; is 
the heterogeneity temperature. 

Most retrieval studies consider a single population of unocculted 
heterogeneities with a temperature different from that of the pho- 
tosphere (but with the same metallicity and surface gravity). Given 
these assumptions, the stellar contamination factor is given by (e.g., 
Rackham et al. 2018; Pinhas et al. 2018) 


1 
Za, net (Thet) 
l, phot (Tphot) 


(4) 


EA, het = 
1- faer | 1 - 


where the three free parameters are the heterogeneity coverage frac- 
tion, fhet, the heterogeneity temperature, Thet, and the photosphere 
temperature, Tpnot- 

We also consider here a more general treatment of stellar contam- 
ination. Motivated by several strategic gaps identified in the NASA 
ExoPAG SAG 21 community report (Rackham et al. 2023), we also 
consider a two-heterogeneity model with the assumption of a com- 
mon surface gravity relaxed. By defining these heterogeneities as 
starspots and faculae (Tspot < Tpnot < Tfac), we can write the stellar 
contamination factor as 


1 
ele TA, spot (log gspot, Tspot) 
l- | 7 ZA, phot (log 8phot> | (5) 
f f _ IA, fac (log gfac, Trac) 
j ZA, phot (log 8phot> Tphot) 


The addition of a second heterogeneity population and separate sur- 
face gravities increases the number of free parameters for this stellar 
contamination model to eight. 


MNRAS 000, 1-20 (2023) 


10 Marylou Fournier-Tondreau et al. 


4.1.2 Unocculted Stellar Heterogeneity Models for HAT-P-18 b 


We consider five treatments for unocculted stellar heterogeneities 
when retrieving HAT-P-18 b’s transmission spectrum: 


(i) No stellar contamination. 

(ii) One heterogeneity, defined by three free parameters: fhet» 
Thet, and Tphot- 

(iii) Two heterogeneities (spots + faculae), defined by five free 
parameters: fspots Tspots frac» Tac, and Tphot- 

(iv) One heterogeneity with free surface gravity, defined by 
five free parameters: fhet, Thet, log Ehet, Tphot, and log gphot- 

(v) Two heterogeneities with free surface gravity, defined by 
eight free parameters: fspot» Tspot, 108g spot: Stac> Tfac» 108 8tac> Tphot: 
and log gphot- 


4.2 Retrieval Configuration 


We applied three exoplanet retrieval codes to HAT-P-18b’s trans- 
mission spectrum: PosEripon (MacDonald & Madhusudhan 2017; 
MacDonald 2023), Aurora (Welbanks & Madhusudhan 2021), and 
SCARLET (Benneke & Seager 2012; Benneke 2015). We initially con- 
ducted independent analyses to establish which atmospheric and stel- 
lar properties provide the necessary complexity to describe HAT-P- 
18 b’s transmission spectrum. After comparing our results, we chose 
a common set of model assumptions for the three codes: chemical 
opacity from Na, K, H20, CO, CO2, CH4, HCN, and NH3 (the 
most prominent opacity sources expected at HAT-P-18 b’s equilib- 
rium temperature under equilibrium chemistry and/or with chemical 
quenching, see e.g., Madhusudhan et al. 2016); an inhomogeneous 
grey cloud-deck combined with a power-law haze; and one unoc- 
culted stellar heterogeneity (i.e., the one heterogeneity model). Our 
retrievals were conducted on the R = 100 binned variant of HAT- 
P-18 b’s transmission spectrum with the spot-crossing masked (see 
Figure 3), but we found consistent results when retrieving the full- 
resolution NIRISS data. We describe the specific configuration used 
by each retrieval code below and list the priors used in Table 3. 


4.2.1 POSEIDON 


We conducted a series of PosEIDON retrievals to assess the model 
complexity required to fit HAT-P-18 b’s JWST/NIRISS transmission 
spectrum. This includes the five stellar contamination models listed 
in Section 4.1.2, nested models to compute detection significances 
for each chemical species, and additional robustness tests (e.g., a 
retrieval of the unbinned pixel-resolution spectrum). 

Our Poseipon retrievals model HAT-P-18b has a H2+He- 
dominated atmosphere with a standard configuration used for hot 
Jupiter retrieval studies (e.g., MacDonald & Madhusudhan 2017; 
Kirk et al. 2021; Taylor et al. 2023). We prescribe 100 layers, spaced 
uniformly in log-pressure from 10~8—100 bar, where the layers fol- 
low a variant of the pressure-temperature (P-T) profile parameteri- 
zation from Madhusudhan & Seager (2009) but with the reference 
temperature located at 10 bar. The inhomogeneous cloud and haze 
aerosol model follows MacDonald & Madhusudhan (2017). We fit 
for the log} volume mixing ratios of the 8 gases in the reference 
model, using absorption cross-sections (see MacDonald & Lewis 
2022) computed from the following line lists: Na and K (Ryabchikova 
et al. 2015); H2O (Polyansky et al. 2018); CO (Li et al. 2015); CO2 
(Tashkun & Perevalov 2011); CH4 (Yurchenko et al. 2017); HCN 
(Barber et al. 2014); and NH3 (Coles et al. 2019). We include H2-H32 
collision-induced absorption from HITRAN (Karman et al. 2019). 
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Table 3. Retrieval parameters and priors. 


Parameter POSEIDON SCARLET AURORA 
Composition 
log X; U [-12, -1] U [-12, -0.3] U [-12, -1] 
P-T profile 
ay U [0.02, 2.0] — U [0.02, 2.0] 
log Pı2 U[-8,2] — U [-8, 2] 
log P3 U [-2, 2] — U [-2, 2] 
Tref U [300, 2000] U [300, 1700] U [300, 1000] 
Aerosols 
log Chaze — U [-10, 5] = 
loga U [-4, 8] — U [-4, 8] 
Y U [-20, 2] — U [-20, 2] 
log Petouad U [-8, 2] U [-8, 2] U [-8, 2] 
Setoud U (0, 1] U (0, 1] U (0, 1] 
Stellar 
One het. 
fhet U [0.0, 0.5] U [0.0, 1.0] U [0.0, 0.5] 
Thet U [3500, 1.2 T,] = U [3500, 1.2 T,] 
AThet — U [-800, -50] — 
Tphot NIT., o7,] NIT:, OT,] NIT:, OT,] 
log het “U [3.0, 5.0] — — 
log 8phot NV [log gs, Clog g, ] — — 
Two het. 
fiac U [0.0, 0.5] — — 
Fspot U [0.0, 0.5] — — 
Tiac U IT; -3 oT,, 12T] — = 
Tspot U [3500, T; + 3 or, ] — = 
Tphot NIT:, o7,] — — 
log &fao “U [3.0, 5.0] — — 
log gspot “U [3.0, 5.0] — — 
log 8phot N [log gx, Clogg, ] — = 
Other 
Rp, ref U [0.85 Rp, 1.15 Rp] — — 
log Prep  — — U [-8, 2] 


Note: All three retrieval codes adopt the one heterogeneity stellar contam- 
ination model for this comparison. Tef is defined at 10 bar for PosErpon, 
1078 bar for Aurora, and is the isothermal temperature for SCARLET. All 
pressure parameters are expressed in units of bar and temperatures in K. For 
the priors, we adopt Rp = 0.995 Rj, Ts = 4803K, logg» = 4.57 (cgs), 
or, = 80K, and Clogg, = 0.04 (cgs). All retrievals have a fixed stellar 
metallicity of [Fe/H] = 0.1, while those without free surface gravity assume 
log g = log g, = 4.57 in all regions. ‘—’ indicates that the parameter did not 
feature in the retrieval. All log parameters are base 10. 


We also fit for the planetary radius at a reference pressure of 10 bar. 
We calculate model transmission spectra from 0.55—2.9 um at R = 
20,000. We calculate the stellar contamination factor (Equation 3) 
by interpolating PHOENIX models (Husser et al. 2013) using the 
PyMSG package (Townsend & Lopez 2023). Our PosEIDON retrievals 
have up to 27 free parameters (depending on the chosen stellar con- 
tamination model from Section 4.1.2), with 1,000 nested sampling 
live points used by the PyMultiNest (Feroz et al. 2009; Buchner 
et al. 2014) package to chart the parameter space. 


4.2.2 SCARLET 


We performed retrievals using the SCARLET retrieval framework 
(Benneke 2015; Benneke et al. 2019a). We improved the SCARLET 
framework from previous work on low-resolution observations (Ben- 
neke et al. 2019a,b; Piaulet et al. 2023) with the addition of non- 
uniform cloud coverage and of the contribution of stellar contamina- 
tion to the transmission spectrum. 

In the SCARLET retrieval, the atmosphere is parameterized by the 
abundances of the spectrally-active chemical species of interest (Na, 
K, H20, CO, CO2, CH4, HCN, NH3) which are assumed to be well- 
mixed and constant with pressure, and an isothermal temperature 
structure. The ratio of He and H2, which make up the remainder of 
the gas, is set to Jupiter’s He/Hp of 0.157. 

The retrieval includes three parameters describing aerosols 
(log Pcloud» feloud, 10g Chaze). The cloud parameterization consists 
of a grey cloud, opaque across all wavelengths, with a cloud top 
pressure Peloud: Potential fractional cloud coverage is captured by 
the fcloud parameter and corresponds to a weighted average of a 
cloud-free and a cloudy model. The contribution of haze to the 
transmission spectrum via a slope towards shorter wavelengths is 
parameterized using the Chaze parameter, which is a factor that mul- 
tiplies the Rayleigh contribution to the scattering coefficient (unity 
corresponds to standard Rayleigh scattering). 

Finally, two parameters describe the stellar heterogeneity: ATspot 
and fspot- Stellar heterogeneity is implemented following the standard 
approach (see Section 4.1.1) assuming that the contribution of stellar 
contamination to the spectrum can be represented by that of spots 
with a temperature of Tspot = Tphot + ATspot (where Tphot is the 
photosphere temperature) covering a fraction fspot of the star. Stellar 
models are queried from the PHOENIX grid of stellar atmosphere 
models (Husser et al. 2013). 

The forward models calculated by the retrieval have a resolving 
power of 15,625. We use opacity sampling from cross-section tables 
at a resolving power of 250,000 to build the opacity tables used, 
and the opacities are taken from the ExoClimes Simulation Platform 
database. We choose HITEMP opacities for H20, CO2, CH4, NH3 
(Rothman et al. 2010), and ExoMol opacities for all other species 
(Tennyson et al. 2016). We use Nested Sampling to sample the pa- 
rameter space and its Python implementation in the nestle module 
(Skilling 2004; Mukherjee et al. 2006; Shaw et al. 2007). 


4.2.3 AURORA 


We also perform atmospheric retrievals with Aurora (Welbanks 
& Madhusudhan 2021) to interpret the transmission spectrum of 
HAT-P-18 b. The application of Aurora to transmission spectra of 
exoplanets is described in Welbanks & Madhusudhan (2022) follow- 
ing the general setup presented Welbanks & Madhusudhan (2019); 
Welbanks et al. (2019). We follow the methods of Pinhas et al. (2018) 
to consider the impact of stellar heterogeneities as implemented in 
previous works (see e.g., Ahrer et al. 2023). 

Aurora computes line-by-line radiative transfer in transmission ge- 
ometry for a parallel-plane atmosphere assuming hydrostatic equilib- 
rium. Our atmospheric model is computed using a 100 layer pressure 
grid uniformly spaced in log-pressure between 107 8 and 100 bar, and 
a wavelength grid from 0.55 um to 2.9 um, covering this NIRISS ob- 
servation, at a constant resolution of R = 20,000. We parameterize the 
P-T profile of the atmosphere using the prescription from Madhusud- 
han & Seager (2009) and the presence of inhomogeneous clouds (e.g., 
Line & Parmentier 2016) and hazes (e.g., Lecavelier Des Etangs et al. 
2008), using a single sector for their combined spectroscopic imprint 
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Figure 7. Contributions of planetary atmosphere and stellar features to HAT- 
P-18b’s transmission spectrum. The best-fitting model spectrum from the 
PosEIDON one heterogeneity retrieval (dashed line) — see Section 4.4 — is 
compared to the NIRISS/SOSS data (error bars) for reference. The coloured 
lines show the effect of removing the cloud (purple) and starspot (red) from 
the best-fitting model, while the other lines show models with combinations 
of a cloud, starspot, H2 continuum opacity, and a single atmospheric chemical 
species (Na: magenta; H20: blue; CO2: orange) or no additional absorbers 
(grey). HAT-P-18 b’s transmission spectrum is therefore well-explained by 
unocculted starspots, a cloud deck, Na, H20, and COp. 


following the description in Welbanks & Madhusudhan (2021). The 
volume mixing ratios for the eight chemical species considered in 
our fiducial model are assumed to be constant with height and use 
cross-sections for Hy—H» and H2—He collision induced absorption 
(CIA; Richard et al. 2012), CH4 (Yurchenko & Tennyson 2014), CO 
(Rothman et al. 2010), CO, (Rothman et al. 2010), H20 (Rothman 
et al. 2010), HCN (Barber et al. 2014), K (Allard et al. 2016), Na 
(Allard et al. 2019), and NH3 (Yurchenko et al. 2011), computed as 
described in Gandhi & Madhusudhan (2017, 2018) and Welbanks 
et al. (2019). Following Pinhas et al. (2018), we model the hetero- 
geneous stellar photosphere by interpolating stellar models from the 
PHOENIX database (Husser et al. 2013). The parameter estimation 
is performed using the nested sampling algorithm (Skilling 2004) 
with MultiNest (Feroz et al. 2009) and its python implementation 
PyMultiNest (Buchner et al. 2014). 


4.3 An Explanation for HAT-P-18 b’s Transmission Spectrum 


We first provide an intuitive explanation for HAT-P-18 b’s transmis- 
sion spectrum. Figure 7 shows a spectral decomposition of the maxi- 
mum likelihood (best-fitting) model transmission spectrum from the 
PosEIDON one heterogeneity retrieval (model (ii) in Section 4.1.2). 
We detect multiple H20 absorption features near 0.95 wm, 1.15 um, 
1.4 um, 1.9 um, and 2.6 um (with a combined significance of 12.5 o`). 
We further detect a CO, absorption feature near 2.7 um (7.3 o) and 
infer evidence of Na at lower significance (2.7 o). The small am- 
plitude of the absorption features is due to an optically thick cloud 
deck which is uniform around the terminator (7.4 o). The inference 
of non-patchy clouds is driven by the wing shape of molecular bands 
(MacDonald & Madhusudhan 2017) — especially the H20 band cen- 
tred near 1.4 wm — and the flat continuum from 1.6—1.8 um. Finally, 
the spectral slope shortwards of 1.65 um is caused by the combi- 
nation of unocculted starspots (5.8 o) and the uniform cloud deck. 
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Figure 8. Pixel-resolution transmission spectrum of HAT-P-18 b around the 
1.083 um helium triplet. We model the helium triplet as a Gaussian convolved 
to the resolution of NIRISS/SOSS and fitted to the data. The best-fit is shown 
in blue, and the unconvolved Gaussian in green. The central wavelength of 
the metastable helium triplet is indicated as the vertical dashed line. 


The quoted detection significances are from Bayesian model compar- 
isons between the reference PosEIDON one heterogeneity model and 
another retrieval with one model component removed (e.g., Benneke 
& Seager 2013). We do not identify statistically significant evidence 
for K, CO, CH4, HCN, NH3, nor a scattering haze. We note that 
these conclusions are independently retrieved by the SCARLET and 
Aurora retrievals, which produce similar best-fitting spectra (see 
Section 4.4). Finally, we determine below that a single high data 
point near 1.1 um, not fit by our retrieval models, is attributable to 
metastable helium. We verified that removing this data point does 
not alter the retrieval results. 


4.3.1 Evidence of Helium Absorption 


We assessed evidence of helium absorption in HAT-P-18 b’s atmo- 
sphere via an independent fit to the pixel-resolution transmission 
spectrum near 1.083 um. Figure 8 shows that a Gaussian fit, with free 
amplitude and width, favours an excess absorption of 0.13 + 0.03 % 
(4.3 o`) centred around the near-infrared helium triplet at 1.083 um. 
Our result is consistent within 1 ø to Fu et al. (2022), Vissapragada 
et al. (2022), and Paragas et al. (2021). Since the helium line profile 
is unresolved with NIRISS/SOSS, even at pixel resolution, we do not 
conduct more complex modelling of the helium line. Consequently, 
we cannot unambiguously attribute this signature to the planetary 
atmosphere nor disentangle the layers probed by the helium triplet. 
Nonetheless, we confirm that HAT-P-18b exhibits a longer transit 
duration with more absorption post transit — as also observed by Fu 
et al. (2022) — which could indicate an extended atmosphere with 
the presence of a cometary-like tail as exhibited by WASP-107b 
(Allart et al. 2019). 

Future observations of HAT-P-18b at higher spectral resolution 
can confirm the helium absorption. A confirmation would indicate 
that HAT-P- 18 b is losing its atmosphere and is undergoing significant 
mass loss. We estimated the intrinsic excess absorption arising from 
the resolved helium triplet to be 3.17 + 0.67 % (Figure 8). This value 
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was derived by fitting a Gaussian with a fixed width of 1A (the 
typical width of the helium triplet; e.g., Allart et al. 2018, 2019; 
Nortmann et al. 2018; Salz et al. 2018) and a free intrinsic amplitude, 
which was then convolved to the pixel-resolution NIRISS/SOSS data. 
Such a strong helium signature would be readily detectable with 
ground-based high-resolution spectrographs, which would provide 
complementary observations to study the unique processes shaping 
HAT-P-18 b’s upper atmosphere. 


4.4 The Atmosphere of HAT-P-18 b 


Our retrieved atmospheric properties for HAT-P- 18 b are summarized 
in Figure 9. All three retrieval codes obtain retrieved transmission 
spectra consistent with the picture presented previously: HAT-P- 
18b’s observed spectrum is shaped by absorption from HzO and 
CO» (and likely Na), unocculted starspots, and a cloud deck. The 
posterior distributions in Figure 9, and corresponding confidence 
regions in Table 4, provide our quantitative constraints on HAT-P- 
18 b’s atmospheric properties. 

We obtain precise constraints on several atmospheric proper- 
ties, including the HO and CO», abundances, cloud deck pres- 
sure, and atmospheric temperature. The retrieved H2O abun- 


dance (log Xy,0 = —4,47+0.30 (Posemon); —4.1 13 (SCARLET); 


—4.65+0:24 (AURroRA)) is ~ 10x lower than the expected abun- 
dance for a solar-composition atmosphere under chemical equilib- 
rium at HAT-P-18 b’s equilibrium temperature (Hartman et al. 2011). 


Conversely, the retrieved CO2 abundance (log Xco, = -4.860 


(PoserDon); —4.39+0.38 (SCARLET); -5.161940 (AurorRA)) is > 
100x higher than expected for a similar solar-composition at- 
mosphere (cf. Woitke et al. 2018). The retrieved Na abun- 
dance (log Xna = -6.9903 (PosEIDon); -5.45+0- (SCARLET); 
-6.6510-% (AurorA)) is more weakly constrained, being broadly 
consistent with a range of sub-solar to solar (PosEIDON and AURORA) 
or even a super-solar composition (SCARLET). An opaque cloud 
deck, consistent with uniform terminator coverage (felouq > 0.83, 
see Table 4), is localized with a cloud top pressure of ~ 30 mbar 


(log Peloud = —1.52*0:19 (Posew on); -1.53+0;10 (SCARLET); 


1,400.07 (AurorA)). Finally, the retrieved P-T profile, over the 
pressure range probed by our NIRISS/SOSS observation, is isother- 
mal at ~ 600K (Tef = 616+ K (Posewon); aoa K (SCARLET); 
673+% K (Aurora)). We discuss the sensitivity of these inferences 
to the assumed stellar contamination model in Section 4.5.3. We note 
that, while SCARLET favours slightly higher abundances and lower 
temperatures than PosEIDON and Aurora, all three codes produce 
consistent retrieval results to within 1 o. The small differences be- 
tween our retrieval results may be due to the slightly different model 
configuration and priors used by SCARLET (see Table 3). 

We also obtain robust upper limits on the abundances of several 
important chemical species. Our retrievals strongly disfavour the 
presence of K (log Xg < —10 to 2), which could be due to con- 
densation out of the gas phase. Similarly, we establish that CH4 is at 
least 100x less abundant than the expectation for a solar-composition 
atmosphere in chemical equilibrium (log Xcy, < —6 to 2o vs. the 
expected log Xcu, ~ —4; Woitke et al. 2018), which could indicate 
photochemical dissociation (e.g., Moses et al. 2011; Venot et al. 
2012). Although we see a tentative hint of CO in our posteriors (see 
Figure 9), additional observations at longer wavelengths, covering 
the 4.6 um CO band, would be required to robustly constrain the 
CO abundance (and hence the atmospheric C/O ratio). We also find 
upper limits on HCN and NH3 (log Xycn <$ —5; log XNH; S —6), 
which are consistent with both equilibrium and disequilibrium ex- 
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Figure 9. Atmospheric and stellar retrieval results for HAT-P-18 b. Top: Retrieved model transmission spectra from our JWST/NIRISS spectra (black errors). 
For each of the three retrieval codes (PosEIDon; purple, SCARLET; orange, and Aurora; blue), the median retrieved spectrum (solid lines) and 1 o confidence 
intervals (shaded contours) are shown. The most important model features required to explain HAT-P-18 b’s NIRISS transmission spectrum are annotated. All 
three codes adopt the one heterogeneity model for this comparison. Bottom: Posterior probability distributions corresponding to the retrieval model in the top 
panel. The top row shows retrieved atmospheric properties with constrained values, the middle row shows non-detected chemical species with abundance upper 
limits, and the bottom row shows the retrieved unocculted starspot properties. Each histogram is annotated with the retrieved median and + 1 o confidence 
intervals from the Poseo retrieval for reference (see Table 4 for the results from all three codes). 


pectations for hot giant planets (e.g., Moses et al. 2011; Venot et al. 
2012). These upper limits underline that SOSS data is of sufficiently 
high quality to provide scientifically meaningful constraints even on 
the abundances of non-detected chemical species. 


4.5 Unocculted Stellar Heterogeneities 


We next turn to constraints on unocculted stellar heterogeneities. 
We first present results under the assumption of a single unocculted 
heterogeneity population with fixed log g — a common approach in 
the literature — before relaxing this assumption by considering more 
complex treatments of the TLSE. Finally, we examine the impact of 
different unocculted stellar heterogeneity models on the retrieved 
atmospheric properties of HAT-P-18b. 


4.5.1 One Heterogeneity Stellar Contamination Model 


Assuming a single population of unocculted stellar active regions, 
our retrievals require unocculted starspots to explain HAT-P-18 b’s 
transmission spectrum (see Figure 9 and Table 4). The spots are ~ 
300K cooler than the photosphere (Thet = 4487+ K (Posepon),; 


4408+16 K (SCARLET); 4513+$3 K (Aurora), compared to Tphot ~ 


4800 K) and cover ~ 10 % of the stellar surface (fhet = ote 


(POSEIDON); 0.10+0:01 (SCARLET); 6.1200 (AurorA)). We note 
that SCARLET has smaller uncertainties on the retrieved stellar pa- 
rameters, which we attribute to the different parameterization of the 


stellar heterogeneity and hazes in the prior. 


4.5.2 More Complex Stellar Contamination Models 


We find that the best-fitting stellar contamination model for HAT- 
P-18 b is two distinct populations of unocculted heterogeneities — 
spots and faculae — with different surface gravities. We established 
this via a series of Bayesian model comparisons between the five 
stellar contamination models listed in Section 4.1.2. Our results, 
summarized in Table 5, demonstrate a significant improvement in 
the Bayesian evidence for the two heterogeneities with free surface 
gravity model compared to the fiducial one heterogeneity model we 
have focused on thus far (a Bayes factor of 77.5, equivalent to 3.4 o`). 
We also find that the preference for unocculted heterogeneities vs. 
no stellar contamination is greater for the two heterogeneities with 
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Figure 10. Unocculted stellar heterogeneity properties from the Posemon retrieval analysis. Posterior distributions from four retrieval models are shown, from 


top to bottom: (i) a single population of stellar heterogeneities (spots) with log gspot = 


log Zphot = 4-57, as in Figure 9 (red); (ii) two distinct populations of spots 


and faculae, all with log g = 4.57 (orange); (iii) same as the spot model, but with the log g of the spots and photosphere as free parameters (gold); and (iv) same 
as the spot + faculae model, but with free log g for all three stellar components (green). The histograms are ordered such that the columns (within each row) 
have the same physical interpretation between the different retrievals (e.g., fhet from the one heterogeneity models corresponds to fspot in the two heterogeneities 
models). The retrieved atmospheric properties for HAT-P-18 b are consistent across all four unocculted stellar heterogeneity models. 


free surface gravity model compared to the one heterogeneity model 
(6.50 vs. 5.80). Nevertheless, we show below that adopting the 
preferred stellar contamination model does not significantly alter the 
atmospheric constraints for HAT-P-18 b presented in Figure 9. 


We show in Figure 10 that the choice of stellar contamination 
model can significantly alter inferences about unocculted stellar ac- 
tive regions. The most significant change is that retrievals with free 
log g favour much cooler spots (AT ~ —800K vs. ~ —300 K) than 
those with log gspot = log gpnot- Indeed, the retrieved spot surface 
gravity is lower than the photosphere by > 2.0. We similarly find 
that fitting for log gfac allows a more accurate determination of 
the faculae temperature compared to fixing the surface gravities. 
Overall, our preferred solution for HAT-P-18’s unocculted active re- 
gions is as follows: spots and faculae cover fgpot = aes % and 
fiac > 4% (20 lower limit) of HAT-P-18’s surface, with temper- 
atures of Tspot = 4045+162 K and Tac = 4952+62 K and surface 
gravities of log gspot < 4.17 (2 o upper limit) and log gf. > 4.45 
(2 o lower limit). These compare to a background photosphere with 

Tphot = 4840+50 K and log gphot = 4. 56+9; ms These results demon- 
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strate that more complete stellar contamination models can provide 
deeper insights into the nature of unocculted stellar active regions. 


4.5.3 Sensitivity of Atmospheric Properties to Stellar 
Contamination Models 


Finally, we find that, provided the spectroscopic imprint of unoc- 
culted heterogeneities is considered in the atmospheric model, the 
adopted stellar contamination model only minimally impacts the re- 
trieved atmospheric properties for this data. Figure 11 shows that our 
preferred stellar contamination model (two heterogeneities with free 
surface gravity) produces retrieved temperatures, cloud pressures, 
and H20 and CO, abundances consistent with the simpler spot- 
only model with fixed surface gravity (one heterogeneity). However, 
a retrieval not including stellar contamination would attribute the 
spectral slope (caused by spots and a cloud deck) to an atmospheric 
haze and erroneously retrieve a H20 abundance 1-2 ø higher than 
when spots are included. Therefore, while the finer details of the 
stellar contamination model are less crucial if one only wishes to 
constrain the planetary atmosphere, a simple stellar contamination 


Table 4. Retrieval results for the one heterogeneity model. 


Parameter POSEIDON SCARLET AURORA 
Composition 
0.30 0.27 0.24 
log Xm, o -4.47 028 -4-111032 -4.65022 
0.59 0.69 0.42 
log XNa -6.99+ 509 -5.45150 -6.6518 
logXco, -486945 4.399038 511040 
log Xk < -10.26 < -9.87 <-10.11 
log Xcu, < -6.29 < -6.12 < -6.30 
log Xco < -2.58 < -2.26 < —2.85 
log Xycn < -5.51 < -5.01 < -5.56 
log XNH, < -6.01 < -6.13 < -6.21 
P-T profile 
99 81 90 
Tief 61673 523*%5 673+ 35 
Aerosols 
log Paloud -1.52910 -1.53+010 -1.4090 
Fetoud > 0.87 > 0.83 > 0.90 
Stellar 
0.03 0.01 0.04 
foet 0. 100.02 0.1001 0.12" 603 
Thet 448715 ‘4 4408} 12 451 300 
4 22 62 
Tpnot 4817") 4740+10 4821" %0 


Note: We do not list results for unconstrained parameters (e.g., the other P-T 
profile parameters, since the retrieved profiles are essentially isothermal). ‘<° 
and ‘>’ represent 2 o upper and lower limits, respectively. 


Table 5. Unocculted stellar heterogeneity Bayesian model comparison. 


Model In Zi Bret i In Bees i Det. Sig. 
Two het. (free log g) 984.74 Ref. Ref Ref. 

No stellar contam. 965.54 2.19 x 108 19.2 650 
One het. 980.39 77.5 4.35 340 
One het. (free log g) 981.53 25.0 3.22 3.00 
Two het. 983.57 3.24 1.18 210 


Note: In Z; is the Bayesian evidence of the i” model. Bret i is the Bayes factor 
of the reference model with respect to the i® model. “Det. Sig.” indicates the 
detection significance, expressed in equivalent “o-” (e.g., Benneke & Seager 
2013), for the reference model, highlighted in bold, vs. the itt model. 


model (e.g., one heterogeneity, as considered in studies such as Pin- 
has et al. 2018 and Rathcke et al. 2021) is vital to obtain reliable 
atmospheric inferences for exoplanets transiting active stars. 


4.6 A Comparison between HST + Spitzer and JWST 


We additionally ran retrievals on our newly reduced HST and Spitzer 
observations to contextualize our JWST results. We consider two 
PosEIDON retrievals, a model without stellar contamination and with 
a single unocculted stellar heterogeneity, adopting the same retrieval 
configuration and priors as used in the JWST analysis (see section 4.2 
and Table 3). The only difference compared to our previous approach 
is an altered model wavelength grid (0.9-5.3 um at R = 20,000) to 
cover the Spitzer data at longer infrared wavelengths. 

Figure 12 compares our retrieved spectrum and atmospheric con- 
straints from HST and Spitzer to our previously presented JWST 
results. We focus on the abundances of H20, CO, and the cloud pres- 
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Figure 11. Impact of unocculted stellar contamination model on retrieved 
atmospheric properties. Posterior distributions are shown from three Poser- 
DON retrieval models: no stellar contamination (black), one heterogeneity 
(red), and two heterogeneities with free surface gravity (green). The one 
heterogeneity (“spots”) model corresponds to the results in Figure 9. With- 
out considering stellar contamination, one would retrieve a H2O abundance 
biased 1-2 o higher and find no cloud deck. The CO, abundance and atmo- 
spheric temperature are less sensitive to the stellar contamination model. 


sure for this comparison since these are the most well-constrained 
model parameters from our JWST/NIRISS data (see section 4.4). 
Both retrieval models infer an attenuated absorption feature in the 
HST/WFC3 data, suggesting a degree of cloud opacity (log Peloud S 
—2 to lo). The model without stellar contamination attributes the 
absorption near 1.1 and 1.4 um to H20, with an abundance con- 
straint at the order-of-magnitude level (log Xy,0 = -2.43+0-87). 
However, the unocculted starspot model, which obtains an improved 
fit to the Spitzer data, finds the HST/WFC3 data can be explained 
by either H20 absorption or CH4 in combination with starspots. 
This three-way degeneracy between fhet, log Xy,0, and log XcH, — 
caused by the lack of data shortwards of 1.0 uym — removes the H2O 
detection one could claim under the assumption of no stellar con- 
tamination. Therefore, HAT-P-18 b’s H20 abundance is essentially 
unconstrained from the HST and Spitzer observations. In contrast, 
our JWST/NIRISS observation has the wavelength coverage to break 
the degeneracies between unocculted starspots and the atmospheric 
composition, affording detailed characterization of HAT-P-18 b’s at- 
mosphere. 


5 SUMMARY & DISCUSSION 


This work has led to important inferences on both the stellar activity 
of HAT-P-18 and the atmosphere of its hot Saturn. Our main results 
are as follows: 


e The transit light curves show evidence of a spot-crossing event 
(>5o); we inferred that the most likely parameters for that occulted 
spot are a position on the projected stellar surface of (x, y) = (0.090 
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Figure 12. Atmospheric retrieval of Hubble and Spitzer transmission spectra 
of HAT-P-18b. Left: Retrieved spectrum assuming no stellar contamina- 
tion (black) and with a single heterogeneity (red), shown via their median 
(solid lines) and 1 o confidence intervals. Right: Corresponding posterior 
probability distributions. The JWST NIRISS/SOSS retrieval results from the 
single heterogeneity PosEIDon retrieval (see Figure 9) are overlaid, and the 
retrieved median and + 1 o parameter constraints annotated, demonstrating 
that NIRISS/SOSS data provides significantly improved constraints on the at- 
mospheric composition and cloud pressure compared to Hubble and Spitzer 
data. 


+ 0.005, 0.42 + 0.05) Rx, with a radius of 0.116 + 0.014R«, anda 
temperature colder than the star of AT = -93 + 15 K. 

e The main features in the transmission spectrum retrieved from 
NIRISS/SOSS observation are multiple absorption features produced 
by water vapour (12.5 o) with a retrieved abundance of log H2O 
x —4.4 + 0.3, a rise at redder wavelengths due to a CO» absorption 
feature (7.3 0) and evidence of Na (2.7). Also, there is a slope 
towards bluer wavelengths caused by unocculted starspots (5.8 o`) 
and a uniform, grey cloud deck (7.4.0) that mutes some absorption 
features. 

e Modelling stellar heterogeneities led to four slightly different 
solutions for the occulted spot, and we showed that different treat- 
ments of the unocculted active regions could be used. Fortunately, we 
found that the different solutions for the active regions had no signif- 
icant impact on the retrieved transmission spectrum and atmospheric 
properties of HAT-P-18 b. 

e Modelling spot spectra is best achieved with stellar models with 
lower surface gravities than the stellar photosphere. For the most 
likely solution of the occulted spot, that is a A log g = 1.16 + 0.19 dex, 
which is in perfect agreement with our solutions for the unocculted 
spots with free log g. 


We proceed to discuss the implications of our results and their 
potential impact on exoplanetary atmosphere studies in transmission 
spectroscopy. 


5.1 The Atmosphere of HAT-P-18 b in Context 


The best-fitting atmosphere model confirms the presence of H20 and 
clouds in the terminator region of HAT-P-18 b, which is in agreement 
with previous work (Tsiaras et al. 2018). As in Fu et al. (2022), our 
transmission spectrum shows a slope towards bluer wavelengths; 
however, to explain it, instead of unocculted stellar heterogeneities, 
these authors inferred Rayleigh haze scattering, as also suggested by 
Kirk et al. (2017) from a ground-based observation. The TLSE, which 
was not taken into consideration in those studies, is likely responsible 
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for that difference, as well as the 10x higher water abundance in Fu 
et al. (2022). We also find 10x less CO3 than was reported by Fu et al. 
(2022), and we furthermore cannot confirm their detection of CH4; 
both differences could come from the data reduction and modelling 
differences. 

We show in Figure 13 our retrieved chemical abundances in com- 
parison to an equilibrium chemistry model from FastChem 2 (Stock 
et al. 2022) at 600 K with a metallicity of 0.1 dex anda solar C/O ratio. 
We selected 10x sub-solar metallicity for this reference equilibrium 
model based on our retrieved sub-solar H2O abundance (which is 
our most precisely constrained abundance), allowing us to compare 
the other retrieved molecular abundances to equilibrium expecta- 
tions. The retrieved Na abundance is consistent with this equilibrium 
model, as are the non-detections of CO, HCN and NH3. On the other 
hand, our retrieved abundance of CO3 is significantly more abundant, 
whereas CH4 and K are depleted by at least two orders of magni- 
tude. We also computed equilibrium chemistry models with different 
C/O ratios, but the abundances did not change significantly. The dif- 
ferences between our free-retrieval results and the expectations from 
equilibrium chemistry may be indicative of either 1) non-equilibrium 
effects, like photochemistry for CH4 and clouds for K or 2) a demon- 
stration of the sensitivity of free-retrievals to individual points which 
could be driving an anomalously high CO2 abundance. Methane, 
which would be expected at the temperature of HAT-P-18 b, was also 
not detected in the atmosphere of the slightly warmer hot-Saturn 
WASP-39 b (Feinstein et al. 2023). The C/O ratio is still uncertain 
for HAT-P-18b’s atmosphere and the need for longer wavelength 
observations with other instruments, such as NIRSpec G395H, are 
required to measure all the main carbon-bearing species and further 
investigate possible CH4 signatures in the infrared. 

Our detection of CO2 may be spurious, and the abundance estimate 
unreliable. An edge effect in the NIRISS observation may be the 
culprit, as the CO7 band is right at the red end of the SOSS spectrum, 
even exceeding it slightly. Longer wavelength data with the Near 
Infrared Spectrograph (NIRSpec) G395H grating would allow the 
detection of CO to be confirmed and its abundance refined. 

Our atmospheric analyses consistently find a uniform cloud deck 
as the only aerosol required to explain HAT-P-18 b’s NIRISS/SOSS 
transmission spectrum. Atmospheric studies of hot Jupiters often 
predict a degree of patchy clouds in hot Jupiter atmospheres (e.g., 
Parmentier et al. 2016; Komacek et al. 2022). However, despite al- 
lowing for patchy clouds, our retrievals all favor terminator cloud 
fractions > 83% (to 20). Future general circulation model stud- 
ies focused on HAT-P-18 b can further investigate whether uniform 
cloud coverage is more favoured for this Saturn-mass planet with a 
colder equilibrium temperature (~ 850 K) than many hot Jupiters. 
Additional observations in the infrared will also be informative. 


5.2 Legacy of Hubble & Spitzer in Light of JWST Results 


For our HST and Spitzer transmission spectrum, the model with- 
out stellar contamination finds an abundance of H2O (log H20 = 
-2.43+0-87) and a cloud pressure (log Pejoug S —2 ~% 10 mbar) in 
good agreement with the findings of Tsiaras et al. (2018) (log H20 = 
-2.63 + 1.18 and log Pojoyg = -2.18 + 0.91 ~ 6.6 mbar). However, we 
showed that a model considering the stellar contamination with a sin- 
gle heterogeneity removes the H20 detection because of a three-way 
degeneracy between the coverage fraction of heterogeneities and the 
abundances of H20 and CHy due to the lack of the visible spectrum 
with HST/WFC3. 

The impact of spots on optical transmission spectra has been 
known for some time to mimic a Rayleigh scattering slope (e.g., 
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Figure 13. Comparison between an equilibrium chemistry model (coloured 
curves) and the retrieved mixing ratios from the PosEIDON one heterogeneity 
model (error bars and arrows). The error bars correspond to 1 o constraints 
for the inferred chemical species (H20, CO2, and Na), whilst the arrows cor- 
respond to 20 upper limits for the non-detected species. The FastChem 2 
equilibrium model shown assumes a solar C/O ratio and 10x sub-solar metal- 
licity (chosen based on the retrieved H20 mixing ratio) and a temperature 
consistent with the retrieved terminator temperature (600 K). 


McCullough et al. 2014; Rackham et al. 2018). Also, for exoplanets 
orbiting red dwarf stars, the 1.4 wm water absorption feature observed 
with HST/WFC3 was sometimes debated to come from potential wa- 
ter molecules in their cold starspots (e.g., Barclay et al. 2021). Now, 
with NIRISS/SOSS, we can break the degeneracy between a hazy 
atmosphere and starspots, and potentially for colder stars, we could 
confirm if water absorption features come from starspots or the exo- 
planetary atmosphere. 

HST and Spitzer inference studies have shown that water vapour, 
alkali metals, as well as clouds are common in hot giant planets 
(e.g., Madhusudhan 2019). Still, there often remained a degeneracy 
between water and clouds, preventing robustly retrieving the abun- 
dance of H20 and the location of the cloud deck (e.g., Benneke & 
Seager 2012, 2013). In this work, we confirmed these detections of 
water vapour, clouds and potentially sodium in a hot giant atmo- 
sphere, and we demonstrate that the improved wavelength coverage 
provided by NIRISS/SOSS is sufficient to break the cloud-metallicity 
degeneracy. 


5.3 Challenges of Occulted Spot Analysis in the Era of JWST 


Our work adds to the growing body of literature demonstrating that 
the inference of the transit and occulted spots properties with JWST 
observations can and should be done to avoid possible biases in 
atmospheric estimates (e.g., Bixel et al. 2019; Rackham et al. 2023). 
A semi-analytic tool like spotrod (Béky et al. 2014), used here, is 
less computationally expensive compared to other tools that use a 
pixelation approach to model the stellar disc. Parallelization of those 


HAT-P-18 b with JWST NIRISS/SOSS 17 


tools could be a simple solution to further improve the computation 
time. 

Moreover, we found that modelling active features from light 
curves, with the current tools and knowledge, is a degenerate prob- 
lem. The standard practice of fixing the orbital parameters using 
a white light curve fit, which also fixes the occulted spot size and 
position, may prevent ruling out some families of solutions. There- 
fore, we are currently developing a package to simultaneously fit all 
spectral channels using both wavelength-dependent and wavelength- 
independent parameters. Still, theoretical advances are needed to 
understand the limits of inferences from occultations and break the 
degeneracy between starspot size and temperature. Fortunately, those 
degeneracies do not impact the retrieved transmission spectrum sig- 
nificantly. Furthermore, spotrod assumes that active features have 
a circular shape and the same limb-darkening law as the star. This 
can also impact the retrieved properties and make it more difficult 
to lift some degeneracies. The occulted feature may, moreover, have 
a more complex structure consisting of a cool spot and a hot facula, 
which could explain the smaller temperature difference retrieved for 
the occulted spot compared to the unocculted spots. Nonetheless, the 
retrieved stellar fraction of the occulted spot foc. spot = 1.35+0.02 % 
is coherent with our preferred solution for the coverage fraction of 
unocculted spots funoc. spot = 2e f. 


5.4 Accounting for Unocculted Heterogeneities in JWST 
Transmission Spectra 


Our best-fitting model suggests that the slope towards bluer wave- 
lengths in this NIRISS/SOSS observation comes mainly from unoc- 
culted stellar heterogeneities. Our work shows that we should con- 
strain planetary atmosphere and stellar contamination properties si- 
multaneously with JWST NIRISS/SOSS data; otherwise, there is a 
risk of biasing atmospheric inferences. However, we could not distin- 
guish clearly which of the four stellar contamination treatments in- 
vestigated (one and two heterogeneities, one and two heterogeneities 
with free surface gravity) is the most plausible, but we were able 
to show that a particular choice of treatment does not significantly 
impact the atmospheric inferences. 

Furthermore, the spectra of active regions are currently modelled 
with 1D stellar models, which is another limitation, particularly for 
faculae (Norris et al. 2017; Johnson et al. 2021), because they fail to 
capture aspects of facular contrasts, such as the center-to-limb varia- 
tion (e.g., Yeo et al. 2013) and the dependence on stellar metallicity 
(Witzke et al. 2018). More theoretical work, such as magnetohydro- 
dynamic simulations of magnetic features, is needed to understand 
better and, thus, model the contrast of active regions more accurately 
as a function of stellar fundamental parameters and activity. In the 
meantime, we suggest that spots’ flux spectra be modelled with stellar 
models with lower surface gravities than the star. 


6 CONCLUSION 


We presented atmospheric retrieval analyses of two transmission 
spectra of HAT-P-18b obtained with JWST NIRISS/SOSS, and 
HST/WFC3 with Spitzer/IRAC. We also modelled a spot-crossing 
event in the transit observed with NIRISS/SOSS. We confirmed that 
the wavelength coverage and spectral resolution of NIRISS/SOSS 
considerably improve the detection significance and the abundance 
constraints on chemical species compared to observations from HST 
and Spitzer combined. In including stellar heterogeneities in transit 
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fits and atmospheric retrievals, we implemented new model consid- 
erations designed to fit the local surface gravity of stellar hetero- 
geneities. This work is instructive for the upcoming JWST transit 
observations, particularly by informing the community on disentan- 
gling stellar and planetary atmosphere signals. 
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SOFTWARE 


— astropy; Astropy Collaboration et al. (2013, 2018) 
— batman; Kreidberg (2015) 

— dynesty; Speagle (2020) 

— emcee; Foreman-Mackey et al. (2013) 
— ExoTiC-LD; Wakeford & Grant (2022) 
— ipython; Pérez & Granger (2007) 

— juliet; Espinoza et al. (2019b) 

— matplotlib; Hunter (2007) 

— nestle; Skilling (2004) 

— numpy; Harris et al. (2020) 

— PosEeipon; MacDonald (2023) 

— scipy; Virtanen et al. (2020) 

— spotrod; Béky et al. (2014) 

— supreme-SPOON; Radica et al. (2023) 


DATA AVAILABILITY 


All data used in this study are publicly available from the Barbara 
A. Mikulski Archive for Space Telescopes’ and the Spitzer Heritage 
Archive. 


7 https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal 
-html 
8 https://sha.ipac.caltech.edu/applications/Spitzer/SHA/ 
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APPENDIX A: ADDITIONAL REDUCTION 


We carried out an independent reduction on the HAT-P-18 b SOSS 
TSO using NAMELESS (described in depth in Coulombe et al. 2023). 
We applied all stage 1 steps of the jwst pipeline except for the dark 
current subtraction. We subsequently applied the world coordinate 
system, source type, and flat field steps of stage 2. We proceeded with 
the background subtraction, scaling independently the two regions 
separated by the jump of the model background provided by STScI, 
as described in detail in Lim et al. (2023). Treatment of the 1/f 
noise was performed by scaling each individual column of the trace 
independently and finding the value that results in the lowest Chi- 
square for a given column and integration (Coulombe et al. 2023). 
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Finally, we extracted the 2D spectra using the getSimpleSpectrum 
function of the transitspectroscopy pipeline? with an aperture 
width of 30 pixels. 

We perform the light curve fitting on the extracted spectropho- 
tometric observation in an identical manner to the one described 
in Section 3. We fix the orbital parameters (Tọ, b, a/Rx) to the 
best-fitting values from the order 1 white light curve fit of the 
supreme-SPOON pipeline to ensure consistency. A comparison of 
the best-fitting white light curve parameters for NIRISS/SOSS with 
the two different pipelines is available in Table Al. We note that the 
parameter values of Rp/R«, b and a/R» differ by 1.2-1.3 o. These 
parameters are correlated (e.g., see the corner plot for one of the 
broadband light curve fits; Figure B4), and each reduction may sim- 
ply prefer a slightly different place in the same family of solutions 
with a comparable agreement. The retrieved transmission spectrum, 
along with the one from the reference supreme-SPOON reduction, 
are shown in Figure A1. These transmission spectra obtained through 
two different pipelines are in overall good agreement; showing con- 
sistent transit depths and features. However, the amplitude of the CO2 
and Na features are different between the two reductions (similarly 
to, e.g., Radica et al. 2023) and would lead to some differences in the 
retrieved abundances. 


APPENDIX B: ADDITIONAL MATERIALS 


This paper has been typeset from a TEX/IATRX file prepared by the author. 


9 https: //github.com/nespinoza/transitspectroscopy 
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Table A1. Comparison of best-fitting “white” light curve parameters 
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Rp/Rs b 


Pipeline Spot-crossing to [BJD - 2400000] a/R. qı q2 
NIRISS 
supreme- Masked 59743.353393*t Dono 0.13770 00a 0.398" 09 15.3175 0.207009 0.31005 
1 1 1 01 1 x 1 
SPOON Modelled 59743.353403 Os, «Ooo: 15.32903 0.25100 Oo 
0.00002 +0.0004 +0.013 0.07 0.02 0.06 
NAMELESS Masked 59743.35340+9,00002  0,1371+0,0004 0.38000% 15.421007 he,  0,33+0,08 
+0.0004 +0.019 0.15 

WFC3 = 57430 0.1373+9.0004  0,373+0.019 ieee 0.179179 0.296171 


(Rp/R=)? [ppm] 


Diff [ppm] 


Note: For the HST/WFC3 white light curve fit, the mid-transit time and the two parameters of the quadratic limb darkening law were 
fixed to those values. Note that the impact parameter value for the fit with the spot-crossing modelled is slightly higher due to a diffe- 
rent model being used (spotrod instead of batman). There are value differences for the limb darkening coefficients because of the 

different bandpasses (see text for exact wavelength ranges). 
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Figure A1. Comparison of transmission spectra for HAT-P-18 b obtained with two different pipelines: supreme-SPOON (blue) and NAMELESS (black). Top: 
Transmission spectra are shown binned to a constant resolving power of R = 100 (darker points) and at the pixel resolution (faded points). Bottom: Difference 


between the NAMELESS’ and supreme-SPOON’s pipeline. 
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Figure B1. HST/WFC3 binned spectrophotometric light curves, along with the best-fitting transit models overplotted (black) for both visits. Light curves are 
arbitrarily offset for clarity. Right: Residuals from the different light curve fits with the associated root-mean-square (RMS) scatter. 
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Figure B2. Spitzer/IRAC light curves. Top: Best-fitting transit models (black), overlaid with the systematics-corrected data (red circles), showing channel 1 
(3.6 um) and channel 2 (4.5 um) in the left and right panel, respectively. Bottom: Residuals from the light curve fits. 
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Figure B3. Exposure in the NIRISS/SOSS F277W filter (2.5 < A < 2.85), highlighting the positions of undispersed contaminants. The extraction apertures for 
orders 1 and 2 are denoted in red and blue respectively. Regions of each order that are masked due to the presence of a contaminant are denoted by the black 


dashed lines (see text for exact wavelength ranges that are masked). 
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Figure B4. Posterior probability distributions from the NIRISS/SOSS broadband light curve fit with spotrod. This corresponds to the joint fit of the orbital 


and starspot parameters. 
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Figure B5. Stellar contamination on the transmission spectrum due to the occulted spot. The contamination factor (blue) represents the squared ratio of the 
planet’s apparent radius Rp over the planet’s true radius R,,ọ. The former corresponds to a transit depth computed with a spot-crossing masked and the latter 
with a spot-crossing modelled. The theoretical factor (red) was derived from Eq. 4. 
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